AITopics | spatial pyramid

Collaborating Authors

spatial pyramid

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

37a749d808e46495a8da1e5352d03cae-Reviews.html

Neural Information Processing SystemsOct-3-2025, 08:56:11 GMT

This might unnecessarily alienate the deep learning crowd, given that it has so often been emphasized that all types of modules are suitable for embedding in deep learning architectures, and bits and pieces of computer vision architectures have in fact been used (e.g.

architecture, cc paperinformation reviewerinstruction, spatial pyramid, (11 more...)

Neural Information Processing Systems

Country: North America > United States > Nevada (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.58)

Add feedback

Pyramidal Flow Matching for Efficient Video Generative Modeling

Jin, Yang, Sun, Zhicheng, Li, Ningyuan, Xu, Kun, Xu, Kun, Jiang, Hao, Zhuang, Nan, Huang, Quzhe, Song, Yang, Mu, Yadong, Lin, Zhouchen

arXiv.org Artificial IntelligenceOct-8-2024

Video generation requires modeling a vast spatiotemporal space, which demands significant computational resources and data usage. To reduce the complexity, the prevailing approaches employ a cascaded architecture to avoid direct training with full resolution. Despite reducing computational demands, the separate optimization of each sub-stage hinders knowledge sharing and sacrifices flexibility. This work introduces a unified pyramidal flow matching algorithm. It reinterprets the original denoising trajectory as a series of pyramid stages, where only the final stage operates at the full resolution, thereby enabling more efficient video generative modeling. Through our sophisticated design, the flows of different pyramid stages can be interlinked to maintain continuity. Moreover, we craft autoregressive video generation with a temporal pyramid to compress the full-resolution history. The entire framework can be optimized in an end-to-end manner and with a single unified Diffusion Transformer (DiT). Extensive experiments demonstrate that our method supports generating high-quality 5-second (up to 10-second) videos at 768p resolution and 24 FPS within 20.7k A100 GPU training hours. All code and models will be open-sourced at https://pyramid-flow.github.io.

conference paper, diffusion model, video, (12 more...)

arXiv.org Artificial Intelligence

2410.05954

Country:

North America > United States > Rocky Mountains (0.04)
North America > Canada > Rocky Mountains (0.04)
Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry:

Information Technology (0.46)
Transportation > Ground (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Unsupervised Template Learning for Fine-Grained Object Recognition

Neural Information Processing SystemsMar-14-2024, 14:41:24 GMT

Fine-grained recognition refers to a subordinate level of recognition, such as recognizing different species of animals and plants. It differs from recognition of basic categories, such as humans, tables, and computers, in that there are global similarities in shape and structure shared cross different categories, and the differences are in the details of object parts. We suggest that the key to identifying the fine-grained differences lies in finding the right alignment of image regions that contain the same object parts. We propose a template model for the purpose, which captures common shape patterns of object parts, as well as the cooccurrence relation of the shape patterns. Once the image regions are aligned, extracted features are used for classification. Learning of the template model is efficient, and the recognition results we achieve significantly outperform the stateof-the-art algorithms.

recognition, template, template model, (13 more...)

Neural Information Processing Systems

Country: North America > United States > Washington > King County > Seattle (0.15)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context

Wang, Weijun, Howard, Andrew

arXiv.org Artificial IntelligenceDec-21-2021

We present a next-generation neural network architecture, MOSAIC, for efficient and accurate semantic image segmentation on mobile devices. MOSAIC is designed using commonly supported neural operations by diverse mobile hardware platforms for flexible deployment across various mobile platforms. With a simple asymmetric encoder-decoder structure which consists of an efficient multi-scale context encoder and a light-weight hybrid decoder to recover spatial details from aggregated information, MOSAIC achieves new state-of-the-art performance while balancing accuracy and computational cost. Deployed on top of a tailored feature extraction backbone based on a searched classification network, MOSAIC achieves a 5% absolute accuracy gain surpassing the current industry standard MLPerf models and state-of-the-art architectures.

convolution, resolution, segmentation, (17 more...)

arXiv.org Artificial Intelligence

2112.11623

Country: Europe > Germany (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Simplifying Reinforced Feature Selection via Restructured Choice Strategy of Single Agent

Zhao, Xiaosa, Liu, Kunpeng, Fan, Wei, Jiang, Lu, Zhao, Xiaowei, Yin, Minghao, Fu, Yanjie

arXiv.org Machine LearningSep-19-2020

Feature selection aims to select a subset of features to optimize the performances of downstream predictive tasks. Recently, multi-agent reinforced feature selection (MARFS) has been introduced to automate feature selection, by creating agents for each feature to select or deselect corresponding features. Although MARFS enjoys the automation of the selection process, MARFS suffers from not just the data complexity in terms of contents and dimensionality, but also the exponentially-increasing computational costs with regard to the number of agents. The raised concern leads to a new research question: Can we simplify the selection process of agents under reinforcement learning context so as to improve the efficiency and costs of feature selection? To address the question, we develop a single-agent reinforced feature selection approach integrated with restructured choice strategy. Specifically, the restructured choice strategy includes: 1) we exploit only one single agent to handle the selection task of multiple features, instead of using multiple agents. 2) we develop a scanning method to empower the single agent to make multiple selection/deselection decisions in each round of scanning. 3) we exploit the relevance to predictive labels of features to prioritize the scanning orders of the agent for multiple features. 4) we propose a convolutional auto-encoder algorithm, integrated with the encoded index information of features, to improve state representation. 5) we design a reward scheme that take into account both prediction accuracy and feature redundancy to facilitate the exploration process. Finally, we present extensive experimental results to demonstrate the efficiency and effectiveness of the proposed method.

machine learning, reinforcement learning, selection, (15 more...)

arXiv.org Machine Learning

2009.0923

Country: North America > United States > Florida > Orange County > Orlando (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Add feedback

Unsupervised Template Learning for Fine-Grained Object Recognition

Yang, Shulin, Bo, Liefeng, Wang, Jue, Shapiro, Linda G.

Neural Information Processing SystemsDec-31-2012

Fine-grained recognition refers to a subordinate level of recognition, such are recognizing different species of birds, animals or plants. It differs from recognition of basic categories, such as humans, tables, and computers, in that there are global similarities in shape or structure shared within a category, and the differences are in the details of the object parts. We suggest that the key to identifying the fine-grained differences lies in finding the right alignment of image regions that contain the same object parts. We propose a template model for the purpose, which captures common shape patterns of object parts, as well as the co-occurence relation of the shape patterns. Once the image regions are aligned, extracted features are used for classification. Learning of the template model is efficient, and the recognition results we achieve significantly outperform the state-of-the-art algorithms.

artificial intelligence, machine learning, template, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Washington > King County > Seattle (0.15)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback